Scalable multiagent learning through indirect encoding of policy geometry

Authors

  • David B. D'Ambrosio
  • Kenneth O. Stanley
Abstract

Multiagent systems present many challenging, real-world problems to artificial intelligence. Because it is difficult to engineer the behaviors of multiple cooperating agents by hand, multiagent learning has become a popular approach to their design. While there are a variety of traditional approaches to multiagent learning, many suffer from increased computational costs for large teams and the problem of reinvention (that is, the inability to recognize that certain skills are shared by some or all team members). This paper presents an alternative approach to multiagent learning called multiagent HyperNEAT that represents the team as a pattern of policies rather than as a set of individual agents. The main idea is that an agent’s location within a canonical team layout (which can be physical, such as positions on a sports team, or conceptual, such as an agent’s relative speed) tends to dictate its role within that team. This paper introduces the term policy geometry to describe this relationship between role and position on the team. Interestingly, such patterns effectively represent up to an infinite number of multiagent policies that can be sampled from the policy geometry as needed to allow training very large teams or, in some cases, scaling up the size of a team without additional learning. In this paper, multiagent HyperNEAT is compared to a traditional learning method, multiagent Sarsa(λ), in a predator-prey domain, where it demonstrates its ability to train large teams.
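The minimal sketch below illustrates the core idea of sampling agent policies from a policy geometry. It is not the paper's actual HyperNEAT/CPPN implementation; the function names and the simple polynomial "pattern" are hypothetical stand-ins for the evolved encoding.

```python
import numpy as np

def policy_geometry(team_position, seed=0):
    # Hypothetical stand-in for the evolved pattern (a CPPN in HyperNEAT):
    # maps an agent's normalized position in the canonical team layout to
    # the parameters of that agent's policy.
    rng = np.random.default_rng(seed)
    basis = rng.standard_normal((3, 8))            # fixed pattern parameters
    features = np.array([1.0, team_position, team_position ** 2])
    return features @ basis                        # this agent's policy weights

def sample_team(n_agents):
    # Query the same underlying pattern once per agent position, so the
    # team can be sampled at any size without learning new parameters.
    positions = np.linspace(-1.0, 1.0, n_agents)   # canonical team layout
    return [policy_geometry(p) for p in positions]

small_team = sample_team(3)    # train with a few agents...
large_team = sample_team(25)   # ...then sample a much larger team later
```

Because every agent's policy is a sample of one shared pattern, skills that vary smoothly across the layout are represented once rather than reinvented for each agent.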


Related articles

Evolving policy geometry for scalable multiagent learning

A major challenge for traditional approaches to multiagent learning is to train teams that easily scale to include additional agents. The problem is that such approaches typically encode each agent’s policy separately. Such separation means that computational complexity explodes as the number of agents in the team increases, and also leads to the problem of reinvention: Skills that should be sh...


Scalable Heterogeneous Multiagent Teams Through Learning Policy Geometry DISTRIBUTION A. Approved for public release: distribution unlimited

In Phase I of the DARPA CSSG we developed an early version of a new algorithm for training multiple robotic agents to coordinate with each other called multiagent HyperNEAT (D’Ambrosio and Stanley 2008). This approach built upon Hypercube-based NeuroEvolution of Augmenting Topologies (HyperNEAT), a new algorithm for evolving artificial neural networks that we had introduced shortly before (D’Am...


Evolving Static Representations for Task Transfer

An important goal for machine learning is to transfer knowledge between tasks. For example, learning to play RoboCup Keepaway should contribute to learning the full game of RoboCup soccer. Previous approaches to transfer in Keepaway have focused on transforming the original representation to fit the new task. In contrast, this paper explores the idea that transfer is most effective if the repre...


A Multiagent Reinforcement Learning algorithm to solve the Community Detection Problem

Community detection is a challenging optimization problem that consists of searching for communities that belong to a network under the assumption that the nodes of the same community share properties that enable the detection of new characteristics or functional relationships in the network. Although there are many algorithms developed for community detection, most of them are unsuitable when ...


Scalable Planning and Learning for Multiagent POMDPs: Extended Version

Online, sample-based planning algorithms for POMDPs have shown great promise in scaling to problems with large state spaces, but they become intractable for large action and observation spaces. This is particularly problematic in multiagent POMDPs where the action and observation space grows exponentially with the number of agents. To combat this intractability, we propose a novel scalable appr...
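As a tiny illustration of the intractability this abstract refers to (not code from the paper), the joint action space of a multiagent POMDP grows exponentially with the number of agents; the function name below is hypothetical.

```python
def joint_action_space_size(n_agents, actions_per_agent):
    # Joint actions are all combinations of individual actions,
    # so the count grows exponentially with the number of agents.
    return actions_per_agent ** n_agents

for n in (2, 5, 10):
    print(n, joint_action_space_size(n, actions_per_agent=4))
# prints: 2 16, 5 1024, 10 1048576
```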



Journal:
  • Evolutionary Intelligence

Volume 6, Issue -

Pages -

Publication date 2013